Objective Evaluation Methods for Chinese Text-To-Speech Systems

Authors

  • Teng Zhang
  • Zhipeng Chen
  • Ji Wu
  • Sam Lai
  • Wenhui Lei
  • Carsten Isert
Abstract

To evaluate the performance of text-to-speech (TTS) systems objectively, many studies take the straightforward approach of comparing synthesized speech with time-aligned natural speech. In most situations, however, no natural speech is available for comparison. In this paper, we focus on machine learning approaches to TTS evaluation. We exploit a subspace decomposition method to separate the different components of speech, which automatically generates distinctive acoustic features. Furthermore, a pairwise Support Vector Machine (SVM) model is used to evaluate TTS systems. With the original prosodic acoustic features and a Support Vector Regression model, we obtain a ranking relevance of 0.7709. With the proposed oblique matrix projection method and the pairwise SVM model, we achieve a much better result of 0.9115.
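The abstract does not spell out the pairwise model, so the following is only a minimal sketch of the general pairwise ranking idea, not the authors' implementation. It assumes a linear SVM trained by plain subgradient descent on the hinge loss; the feature vectors, the listener scores in `mos`, and all function names are hypothetical and invented for illustration. Each pair of systems is turned into a feature-difference vector labelled by which system listeners preferred, and the learned weights are then used to rank the systems.

```python
from itertools import combinations

def pairwise_transform(features, scores):
    """Turn per-system feature vectors into difference vectors with +1/-1 labels."""
    diffs, labels = [], []
    for i, j in combinations(range(len(scores)), 2):
        if scores[i] == scores[j]:
            continue  # ties carry no ordering information
        diff = [a - b for a, b in zip(features[i], features[j])]
        label = 1 if scores[i] > scores[j] else -1
        # Add both orientations of each pair so the two classes are balanced.
        diffs.append(diff)
        labels.append(label)
        diffs.append([-d for d in diff])
        labels.append(-label)
    return diffs, labels

def train_linear_svm(X, y, epochs=200, lr=0.1, reg=0.01):
    """Subgradient descent on the L2-regularised hinge loss (no bias term)."""
    w = [0.0] * len(X[0])
    for _ in range(epochs):
        for x, t in zip(X, y):
            margin = t * sum(wi * xi for wi, xi in zip(w, x))
            for k in range(len(w)):
                # The hinge term contributes only when the margin is violated.
                grad = reg * w[k] - (t * x[k] if margin < 1 else 0.0)
                w[k] -= lr * grad
    return w

def rank_systems(features, w):
    """Return system indices ordered from best to worst predicted quality."""
    scores = [sum(wi * xi for wi, xi in zip(w, f)) for f in features]
    return sorted(range(len(features)), key=lambda i: scores[i], reverse=True)

# Hypothetical toy data: four systems, two acoustic features each,
# plus made-up listener scores used only to derive pairwise preferences.
features = [[0.9, 0.2], [0.1, 0.8], [0.5, 0.5], [0.7, 0.3]]
mos = [4.2, 2.1, 3.0, 3.8]

X, y = pairwise_transform(features, mos)
w = train_linear_svm(X, y)
ranking = rank_systems(features, w)
print(ranking)  # [0, 3, 2, 1]
```

Working on differences means the model only has to get the relative order of two systems right, which is what a ranking metric rewards; a regression model would instead have to predict each absolute score.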

Similar resources

Total quality evaluation of speech synthesis systems

Based on the performance assessment of speech synthesis systems for Chinese, a total quality evaluation of these systems has been carried out regularly since 1994. The total quality evaluation includes speech intelligibility tests at different levels (syllable, word and sentence), a speech naturalness test and an anti-interference ability test for the phonetic module, and a text processing ability test for linguistic ...

A pair-based language model for the robust lexical analysis in Chinese text-to-speech synthesis

This paper presents a robust method of lexical analysis for Chinese text-to-speech (TTS) synthesis using a pair-based Language Model (LM). The traditional approach to Chinese lexical analysis simply treats word segmentation and part-of-speech (POS) tagging as two separate phases, each with its own algorithms and models. In fact, the POS information is useful for word segmentation,...

Cipher text only attack on speech time scrambling systems using correction of audio spectrogram

Recently, permutation multimedia ciphers were broken in a chosen-plaintext scenario. That attack assumes a very resourceful adversary, which may not always be realistic. To show the insecurity of these ciphers, we present a cipher-text only attack on speech permutation ciphers. We show that the inherent redundancies of speech can pave the way for a successful cipher-text only attack. To that end, regularities ...

Modular Text-to-Speech Synthesis Evaluation for Mandarin Chinese

Proper evaluation can efficiently drive the development of text-to-speech (TTS) systems. Assessment is needed to determine how well a system or technique compares with others, or with the previous version of the same system. To obtain more useful feedback for development, we evaluate not only the whole system but also each module of the TTS system separately. Based on...

The TDT-3 Text and Speech Corpus

The TDT-3 Text and Speech Corpus expands on previous phases of Topic Detection and Tracking data collections by increasing the number of news sources sampled, by including Mandarin Chinese as well as English news data, and by introducing new forms of topic annotation. In order to satisfy the specific data and annotation requirements of the TDT-3 Evaluation Plan [1], the LDC refined and su...

Publication date: 2016